AITopics | shai shalev-shwartz

Collaborating Authors

shai shalev-shwartz

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

A Simple Practical Accelerated Method for Finite Sums

Aaron Defazio

Neural Information Processing Systems, Mar-23-2026, 08:46:18 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, proximal operator, (13 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Accelerated Stochastic Greedy Coordinate Descent by Soft Thresholding Projection onto Simplex

Neural Information Processing Systems, Nov-21-2025, 11:32:22 GMT

... PrOjection (SOTOPO)" is proposed to exactly solve an ... In order to improve the convergence rate and reduce the iteration cost further, two important strategies are used in first-order methods: Nesterov's acceleration and stochastic optimization. Nesterov's acceleration refers to the technique of using an algebraic trick to accelerate first-order algorithms, while stochastic optimization refers to the method that samples one training ... This work is supported by the National Natural Science Foundation of China under grant Nos.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.24)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

A Simple Practical Accelerated Method for Finite Sums

Neural Information Processing Systems, Mar-12-2024, 11:02:23 GMT

We describe a novel optimization method for finite sums (such as empirical risk minimization problems) building on the recently introduced SAGA method. Our method achieves an accelerated convergence rate on strongly convex smooth problems. Our method has only one parameter (a step size), and is radically simpler than other accelerated methods for finite sums. Additionally it can be applied when the terms are non-smooth, yielding a method applicable in many areas where operator splitting methods would traditionally be applied.
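For orientation, the SAGA baseline that the abstract builds on can be sketched as follows. This is plain SAGA on a toy scalar least-squares finite sum, not the accelerated method the abstract describes; the function name `saga_least_squares`, the data, and the step size are illustrative assumptions.

```python
import random

def saga_least_squares(xs, ys, step, epochs=200, seed=0):
    """Plain SAGA on f(w) = (1/n) * sum_i 0.5 * (w * x_i - y_i)**2, scalar w.

    Illustrative sketch only: this is the SAGA baseline, not the
    accelerated variant described in the abstract.
    """
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    # Table holding the most recently evaluated gradient of each term f_i.
    table = [(w * x - y) * x for x, y in zip(xs, ys)]
    table_mean = sum(table) / n
    for _ in range(epochs * n):
        j = rng.randrange(n)
        g_new = (w * xs[j] - ys[j]) * xs[j]
        # Variance-reduced step: fresh gradient minus stored one plus table mean.
        w -= step * (g_new - table[j] + table_mean)
        # Keep the running mean consistent with the updated table entry.
        table_mean += (g_new - table[j]) / n
        table[j] = g_new
    return w

# Hypothetical toy data; the step size is the only tuning parameter.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
w_saga = saga_least_squares(xs, ys, step=0.02)
# Closed-form least-squares solution for comparison.
w_exact = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
```

The step size 0.02 is chosen just below 1/(3L) with L = max x_i^2 = 16, the usual SAGA choice for smooth strongly convex terms.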

algorithm, gradient descent, proximal operator, (11 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Theoretical Foundations of Adversarially Robust Learning

Montasser, Omar

arXiv.org Artificial Intelligence, Jun-13-2023

Despite extraordinary progress, current machine learning systems have been shown to be brittle against adversarial examples: seemingly innocuous but carefully crafted perturbations of test examples that cause machine learning predictors to misclassify. Can we learn predictors robust to adversarial examples, and if so, how? There has been much empirical interest in this contemporary challenge in machine learning, and in this thesis, we address it from a theoretical perspective. We explore what robustness properties we can hope to guarantee against adversarial examples and develop an understanding of how to algorithmically guarantee them. We illustrate the need to go beyond traditional approaches and principles such as empirical risk minimization and uniform convergence, and make contributions that can be categorized as follows: (1) introducing problem formulations capturing aspects of emerging practical challenges in robust learning, (2) designing new learning algorithms with provable robustness guarantees, and (3) characterizing the complexity of robust learning and fundamental limitations on the performance of any algorithm.
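To make the notion of an adversarial example concrete, here is a minimal sketch for a linear classifier under an assumed L-infinity perturbation budget `eps`. The classifier, the data point, and the helper names are hypothetical and are not taken from the thesis; the sketch only illustrates how a small, carefully chosen perturbation can flip a prediction.

```python
def predict(w, b, x):
    """Linear classifier: returns +1 or -1 from the sign of w . x + b."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def worst_case_perturbation(w, x, y, eps):
    """Move each coordinate by eps in the direction that most reduces y * (w . x + b)."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [xi - eps * y * sign(wi) for wi, xi in zip(w, x)]

# Hypothetical classifier and a correctly classified test point with label +1.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

clean_prediction = predict(w, b, x)
# The margin is 0.8 and the L1 norm of w is 3, so the label flips once 3 * eps > 0.8.
adv_large = predict(w, b, worst_case_perturbation(w, x, y, eps=0.3))
adv_small = predict(w, b, worst_case_perturbation(w, x, y, eps=0.1))
```

For linear predictors this worst case has a closed form, which is why the flip threshold can be read off from the margin and the L1 norm of `w`; for richer hypothesis classes no such shortcut is generally available.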

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2306.07723

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > San Francisco County > San Francisco (0.13)
North America > United States > Illinois > Cook County > Chicago (0.04)
(22 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
(2 more...)

Understanding Machine Learning: From Theory to Algorithms: Shai Shalev-Shwartz, Shai Ben-David: 9781107057135: Amazon.com: Books

#artificialintelligence, Feb-17-2018, 20:54:57 GMT

But for those who already have some basic ideas about the concepts of ML and the motivation for the theoretical justification of the algorithms, this should definitely be the next book to read: it provides rigorous proofs and presents the concepts and algorithms in clear mathematical language. There is no need to be scared though: the presentation of the material is excellent, the chapters are short enough to let the reader advance in reasonable steps (the book is derived from lectures given by both authors), and there are excellent exercises.

artificial intelligence, machine learning, shai shalev-shwartz, (4 more...)

#artificialintelligence

Industry:

Retail > Online (0.44)
Education > Educational Setting > Online (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.44)

Accelerated Stochastic Greedy Coordinate Descent by Soft Thresholding Projection onto Simplex

Song, Chaobing, Cui, Shaobo, Jiang, Yong, Xia, Shu-Tao

Neural Information Processing Systems, Dec-31-2017

In this paper we study the well-known greedy coordinate descent (GCD) algorithm for solving $\ell_1$-regularized problems and improve GCD with two popular strategies: Nesterov's acceleration and stochastic optimization. First, we propose a new rule for greedy selection based on an $\ell_1$-norm square approximation, which is nontrivial to solve but convex; then an efficient algorithm called "SOft ThreshOlding PrOjection (SOTOPO)" is proposed to exactly solve the $\ell_1$-regularized $\ell_1$-norm square approximation problem induced by the new rule. Based on the new rule and the SOTOPO algorithm, Nesterov's acceleration and stochastic optimization are then successfully applied to the GCD algorithm. The resulting algorithm, called accelerated stochastic greedy coordinate descent (ASGCD), has the optimal convergence rate $O(\sqrt{1/\epsilon})$; meanwhile, it reduces the iteration complexity of greedy selection by up to a factor of the sample size. Both theoretically and empirically, we show that ASGCD performs better on high-dimensional and dense problems with sparse solutions.
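As background for the $\ell_1$-regularized setting, the standard soft-thresholding operator (the proximal operator of a scaled absolute value) can be sketched as below. This is the textbook operator only, not the SOTOPO algorithm from the paper; the names `soft_threshold` and `prox_l1` are illustrative.

```python
def soft_threshold(v, lam):
    """Proximal operator of lam * |.|: shrink v toward zero by lam, clipping at zero."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

def prox_l1(vec, lam):
    """Apply soft thresholding coordinate-wise (prox of lam * ||.||_1)."""
    return [soft_threshold(v, lam) for v in vec]

# Entries with magnitude at most lam are set exactly to zero,
# which is how l1 regularization produces sparse solutions.
shrunk = prox_l1([3.0, -0.5, 0.2, -2.0], lam=1.0)
```

With `lam=1.0` the two small entries are zeroed and the two large ones are pulled toward zero by one unit each.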

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

SelfieBoost: A Boosting Algorithm for Deep Learning

Shalev-Shwartz, Shai

arXiv.org Machine Learning, Apr-8-2017

We describe and analyze a new boosting algorithm for deep learning called SelfieBoost. Unlike other boosting algorithms, like AdaBoost, which construct ensembles of classifiers, SelfieBoost boosts the accuracy of a single network. We prove a $\log(1/\epsilon)$ convergence rate for SelfieBoost under some "SGD success" assumption which seems to hold in practice.
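To read the $\log(1/\epsilon)$ rate, a small arithmetic sketch may help. It assumes, purely for illustration, that the error shrinks by a constant factor `shrink` each round; that factor and the function name are hypothetical and are not quantities from the paper.

```python
import math

def rounds_to_reach(eps, shrink):
    """Smallest T with shrink**T <= eps, for 0 < shrink < 1 and 0 < eps < 1.

    Assumes a hypothetical per-round contraction factor; under that
    assumption T grows like log(1/eps).
    """
    return math.ceil(math.log(1.0 / eps) / math.log(1.0 / shrink))

# Squaring the target accuracy (1e-2 -> 1e-4) only doubles the round count.
rounds_coarse = rounds_to_reach(1e-2, shrink=0.5)
rounds_fine = rounds_to_reach(1e-4, shrink=0.5)
```

Under this assumption, each additional digit of accuracy costs a fixed number of extra rounds rather than a multiplicative increase.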

artificial intelligence, machine learning, neural network, (13 more...)

arXiv.org Machine Learning

1411.3436

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Simple Practical Accelerated Method for Finite Sums

Defazio, Aaron

Neural Information Processing Systems, Dec-31-2016

We describe a novel optimization method for finite sums (such as empirical risk minimization problems) building on the recently introduced SAGA method. Our method achieves an accelerated convergence rate on strongly convex smooth problems. Our method has only one parameter (a step size), and is radically simpler than other accelerated methods for finite sums. Additionally it can be applied when the terms are non-smooth, yielding a method applicable in many areas where operator splitting methods would traditionally be applied.

artificial intelligence, machine learning, operator, (14 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Importance Sampling for Minibatches

Csiba, Dominik, Richtárik, Peter

arXiv.org Machine Learning, Feb-6-2016

Supervised learning is a widely adopted learning paradigm with important applications such as regression, classification and prediction. The most popular approach to training supervised learning models is via empirical risk minimization (ERM). In ERM, the practitioner collects data composed of example-label pairs, and seeks to identify the best predictor by minimizing the empirical risk, i.e., the average risk associated with the predictor over the training data. With ever-increasing demand for accuracy of the predictors, largely due to successful industrial applications, and with ever more sophisticated models that need to be trained, such as deep neural networks [8, 14] or multiclass classification [9], increasing volumes of data are used in the training phase. This leads to huge and hence extremely computationally intensive ERM problems. Batch algorithms (methods that need to look at all the data before taking a single step to update the predictor) have long been known to be impractical at this scale. Typical examples of batch methods are gradient descent and classical quasi-Newton methods.
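The contrast between a batch gradient and a sampled one can be sketched on a toy ERM problem. The example below uses a generic single-example importance-sampled estimator, not the minibatch scheme of the paper; the choice of probabilities proportional to `x_i**2`, the data, and the function names are all illustrative assumptions.

```python
def batch_gradient(w, xs, ys):
    """Gradient of the empirical risk (1/n) * sum_i 0.5 * (w * x_i - y_i)**2.

    A batch method must touch all n examples to compute this once.
    """
    n = len(xs)
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / n

def sampling_probs(xs):
    """Hypothetical importance distribution: p_i proportional to x_i**2."""
    total = sum(x * x for x in xs)
    return [x * x / total for x in xs]

def reweighted_gradient(w, xs, ys, probs, i):
    """Gradient of term i scaled by 1 / (n * p_i), so its expectation is the batch gradient."""
    n = len(xs)
    return (w * xs[i] - ys[i]) * xs[i] / (n * probs[i])

# Hypothetical toy data and current iterate.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
w = 0.5
probs = sampling_probs(xs)

# Exact expectation of the one-example estimator under the sampling distribution.
expected_estimate = sum(
    p * reweighted_gradient(w, xs, ys, probs, i) for i, p in enumerate(probs)
)
```

The `1 / (n * p_i)` scaling is what keeps the estimator unbiased for any sampling distribution with strictly positive probabilities; the distribution only changes its variance.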

artificial intelligence, machine learning, minibatch, (19 more...)

arXiv.org Machine Learning

1602.02283

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
